From gene trees to species trees II: Species tree inference in the deep coalescence model
When gene copies are sampled from various species, the resulting gene tree
might disagree with the containing species tree. The primary causes of gene
tree and species tree discord include lineage sorting, horizontal gene
transfer, and gene duplication and loss. Each of these events yields a
different parsimony criterion for inferring the (containing) species tree from
gene trees. With lineage sorting, species tree inference seeks the tree
minimizing extra gene lineages that had to coexist along species lineages; with
gene duplication, it seeks the tree minimizing gene duplications
and/or losses. In this paper, we show the following results: (i) The deep
coalescence cost is equal to the number of gene losses minus two times the gene
duplication cost in the reconciliation of a uniquely leaf labeled gene tree and
a species tree. The deep coalescence cost can be computed in linear time for
any arbitrary gene tree and species tree. (ii) The deep coalescence cost is
always no less than the gene duplication cost in the reconciliation of an
arbitrary gene tree and a species tree. (iii) Species tree inference by
minimizing deep coalescences is NP-hard.
Comment: 17 pages, 6 figures
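As a rough illustration of the deep coalescence cost discussed in this abstract, the Python sketch below counts extra lineages by mapping each gene tree node to the lowest common ancestor (LCA) of its leaves in the species tree. It assumes a uniquely leaf-labeled gene tree on the same leaf set as the species tree. The nested-tuple tree encoding and the function names are illustrative choices, not taken from the paper, and the naive LCA lookup is quadratic, not the linear-time method the abstract refers to.

```python
def species_depths(tree, depth=0, out=None):
    """Map each species-tree cluster (frozenset of leaves) to its depth."""
    out = {} if out is None else out
    if isinstance(tree, str):
        cluster = frozenset([tree])
    else:
        cluster = frozenset().union(
            *(species_depths(child, depth + 1, out)[0] for child in tree)
        )
    out[cluster] = depth
    return cluster, out


def lca_depth(cluster, depths):
    """Depth of the deepest species cluster containing `cluster` (its LCA)."""
    return max(d for c, d in depths.items() if cluster <= c)


def deep_coalescence(gene, species):
    """Return (deep coalescence cost, duplication cost) under LCA mapping.

    Assumption: `gene` is uniquely leaf-labeled on the species leaf set.
    """
    _, depths = species_depths(species)
    species_edges = len(depths) - 1  # every node except the root has one edge
    path_total = 0  # species edges crossed, summed over all gene edges
    dups = 0

    def walk(node):
        nonlocal path_total, dups
        if isinstance(node, str):
            cluster = frozenset([node])
            return cluster, lca_depth(cluster, depths)
        kids = [walk(child) for child in node]
        cluster = frozenset().union(*(k[0] for k in kids))
        d = lca_depth(cluster, depths)
        path_total += sum(kid_depth - d for _, kid_depth in kids)
        # A gene node is a duplication if it maps to the same species node
        # as one of its children.
        dups += any(kid_depth == d for _, kid_depth in kids)
        return cluster, d

    walk(gene)
    # Each species edge carries at least one lineage; the rest are "extra".
    return path_total - species_edges, dups
```

For the gene tree `(("a", "c"), "b")` and species tree `(("a", "b"), "c")`, the sketch reports one extra lineage and one duplication, which is consistent with result (ii) that the deep coalescence cost is no less than the duplication cost.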
On Tree Based Phylogenetic Networks
A large class of phylogenetic networks can be obtained from trees by the
addition of horizontal edges between the tree edges. These networks are called
tree based networks. Reticulation-visible networks and child-sibling networks
are all tree based. In this work, we present a simple necessary and sufficient
condition for tree-based networks and prove that there is a universal tree
based network for each set of species such that every phylogenetic tree on the
same species is a base of this network. The existence of a universal tree-based
network implies that for any given set of phylogenetic trees (resp. clusters)
on the same species there exists a tree-based network that displays all of them.
Comment: 17 pages, 6 figures
Bounding the Size of a Network Defined By Visibility Property
Phylogenetic networks are mathematical structures for modeling and
visualization of reticulation processes in the study of evolution. Galled
networks, reticulation visible networks, nearly-stable networks and
stable-child networks are four classes of phylogenetic networks that were
recently introduced to study the topological and algorithmic aspects of
phylogenetic networks. We prove the following results.
(1) A binary galled network with n leaves has at most 2(n-1) reticulation
nodes. (2) A binary nearly-stable network with n leaves has at most 3(n-1)
reticulation nodes. (3) A binary stable-child network with n leaves has at most
7(n-1) reticulation nodes.
Comment: 23 pages, 9 figures
Locating a Tree in a Reticulation-Visible Network in Cubic Time
In this work, we answer an open problem in the study of phylogenetic
networks. Phylogenetic trees are rooted binary trees in which all edges are
directed away from the root, whereas phylogenetic networks are rooted acyclic
digraphs. For the purpose of evolutionary model validation, biologists often
want to know whether or not a phylogenetic tree is contained in a phylogenetic
network. The tree containment problem is NP-complete even for very restricted
classes of networks such as tree-sibling phylogenetic networks. We prove that
this problem is solvable in cubic time for stable phylogenetic networks. A
linear time algorithm is also presented for the cluster containment problem.
Comment: 25 pages, 3 figures
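To make the tree containment problem concrete, here is a brute-force Python sketch. It is exponential in the number of reticulations and is not the polynomial-time algorithm of the paper. It keeps one incoming edge per reticulation node in every possible way and compares the clusters of the resulting tree with those of the query tree. The edge-list network encoding, the nested-tuple tree encoding, and the names `tree_clusters` and `displays` are assumptions made for this illustration; parallel edges are not handled.

```python
from itertools import product


def tree_clusters(tree, out):
    """Collect the leaf set below every node of a nested-tuple tree into `out`."""
    if isinstance(tree, str):
        cluster = frozenset([tree])
    else:
        cluster = frozenset().union(*(tree_clusters(c, out) for c in tree))
    out.add(cluster)
    return cluster


def displays(edges, root, tree):
    """Brute-force check that the network (list of (parent, child) edges)
    contains `tree`; leaf node names double as taxon labels."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    leaves = {v for _, v in edges} - {u for u, _ in edges}
    retics = [v for v, ps in parents.items() if len(ps) > 1]
    target = set()
    tree_clusters(tree, target)

    # Try every way of keeping exactly one parent per reticulation node.
    for choice in product(*(parents[r] for r in retics)):
        keep = dict(zip(retics, choice))
        children = {}
        for u, v in edges:
            if keep.get(v, u) == u:
                children.setdefault(u, []).append(v)

        found = set()

        def below(v):
            kids = children.get(v, [])
            if kids:
                cluster = frozenset().union(*(below(k) for k in kids))
            else:
                # Unlabeled dead ends contribute an empty cluster.
                cluster = frozenset([v]) if v in leaves else frozenset()
            if cluster:
                found.add(cluster)
            return cluster

        below(root)
        if found == target:
            return True
    return False
```

The comparison relies on the fact that a rooted phylogenetic tree is determined by its set of clusters, so dead ends and degree-2 nodes never need to be removed explicitly.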
Generating Normal Networks via Leaf Insertion and Nearest Neighbor Interchange
Galled trees are studied as a recombination model in theoretical population
genetics. This class of phylogenetic networks has been generalized to
tree-child networks, normal networks and tree-based networks by relaxing a
structural condition. Although these networks are simple, their topological
structures have yet to be fully understood. It is well-known that all
phylogenetic trees on n taxa can be generated by the insertion of the n-th
taxon into each edge of all the phylogenetic trees on n-1 taxa. We prove that
all tree-child networks with k reticulate nodes on n taxa can be uniquely
generated via three operations from all the tree-child networks with k-1 or k
reticulate nodes on n-1 taxa. An application of this result is found in
counting tree-child networks and normal networks. In particular, a simple
formula is given for the number of rooted phylogenetic networks with one
reticulate node.
Comment: 4 figures and 13 pages
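The leaf-insertion construction for phylogenetic trees mentioned in this abstract can be sketched in a few lines of Python: every rooted binary tree on n taxa is obtained by attaching the n-th taxon to one edge of a tree on n-1 taxa. The sketch assumes rooted trees encoded as nested tuples and treats the position above the root as an edge; the function names are illustrative. It covers trees only, not the three network operations introduced in the paper.

```python
def insert_leaf(tree, leaf):
    """Yield every tree obtained by attaching `leaf` to one edge of `tree`,
    counting the position above the root of `tree` as an edge."""
    yield (tree, leaf)  # attach on the edge entering this subtree
    if not isinstance(tree, str):
        left, right = tree
        for t in insert_leaf(left, leaf):
            yield (t, right)
        for t in insert_leaf(right, leaf):
            yield (left, t)


def all_trees(taxa):
    """Generate all rooted binary phylogenetic trees on `taxa` by
    inserting the taxa one at a time."""
    trees = [taxa[0]]
    for leaf in taxa[1:]:
        trees = [t for tree in trees for t in insert_leaf(tree, leaf)]
    return trees
```

A tree on n-1 taxa offers 2n-3 insertion positions, so the number of trees produced for n taxa is 1 * 3 * 5 * ... * (2n-3); for example, `len(all_trees(["a", "b", "c", "d"]))` is 15.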
Counting and Enumerating Galled Networks
Galled trees are widely studied as a recombination model in population
genetics. This class of phylogenetic networks is generalized into galled
networks by relaxing a structural condition. In this work, a linear recurrence
formula is given for counting 1-galled networks, which are galled networks
satisfying the condition that each reticulate node has only one leaf
descendant. Since every galled network consists of a set of 1-galled networks
stacked one on top of the other, a method is also presented to count and
enumerate galled networks.
Comment: 7 figures, 2 tables
Locating a Phylogenetic Tree in a Reticulation-Visible Network in Quadratic Time
In phylogenetics, phylogenetic trees are rooted binary trees, whereas
phylogenetic networks are rooted arbitrary acyclic digraphs. Edges are directed
away from the root and leaves are uniquely labeled with taxa in phylogenetic
networks. For the purpose of validating evolutionary models, biologists check
whether or not a phylogenetic tree is contained in a phylogenetic network on
the same taxa. This tree containment problem is known to be NP-complete. A
phylogenetic network is reticulation-visible if every reticulation node
separates the root of the network from some leaves. We answer an open problem
by proving that the problem is solvable in quadratic time for
reticulation-visible networks. The key tool used in our answer is a powerful
decomposition theorem. It also allows us to design a linear-time algorithm for
the cluster containment problem for networks of this type and to prove that
every galled network with n leaves has 2(n-1) reticulation nodes at most.
Comment: The journal version of arXiv:1507.02119v
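The definition of reticulation-visibility given in this abstract (every reticulation node separates the root from some leaf) translates directly into a reachability test. The Python sketch below removes each reticulation node in turn and checks whether some leaf can no longer be reached from the root. The edge-list encoding and the function names are assumptions for illustration, and this naive check is unrelated to the decomposition theorem used in the paper.

```python
def reachable(children, root, removed):
    """Nodes reachable from `root` when the node `removed` is deleted."""
    seen, stack = set(), [root]
    while stack:
        v = stack.pop()
        if v in seen or v == removed:
            continue
        seen.add(v)
        stack.extend(children.get(v, ()))
    return seen


def is_reticulation_visible(edges, root):
    """Check the visibility property on a network given as (parent, child) edges."""
    children, indegree = {}, {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
        indegree[v] = indegree.get(v, 0) + 1
    leaves = {v for _, v in edges} - set(children)
    reticulations = [v for v, d in indegree.items() if d >= 2]
    for r in reticulations:
        # r is visible only if deleting it cuts off at least one leaf.
        if leaves <= reachable(children, root, r):
            return False
    return True
```

Each reachability pass is linear in the number of edges, so the whole check costs one graph traversal per reticulation node.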
Analyzing the Accuracy of the Fitch Method for Reconstructing Ancestral States on Ultrametric Phylogenies
Recurrence formulas are presented for studying the accuracy of the Fitch
method for reconstructing the ancestral states in a given phylogenetic tree. As
their applications, we analyze the convergence of the accuracy of
reconstructing the root state in a complete binary tree with 2^n leaves as n goes to
infinity and also give a lower bound on the accuracy of reconstructing the root
state in an ultrametric tree.
Comment: 14 pages
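For readers unfamiliar with the Fitch method analyzed in this abstract, the following Python sketch shows its bottom-up pass on a rooted binary tree: a node receives the intersection of its children's state sets when that intersection is non-empty, and their union otherwise. The nested-tuple tree encoding and the name `fitch_root_states` are illustrative assumptions; the sketch computes only the root state set and does not implement the paper's recurrence formulas for accuracy.

```python
def fitch_root_states(tree, leaf_state):
    """Return the Fitch state set at the root of a nested-tuple binary tree.

    `leaf_state` maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        return {leaf_state[tree]}
    left, right = (fitch_root_states(child, leaf_state) for child in tree)
    common = left & right
    # Intersection if the children agree on some state, union otherwise.
    return common if common else left | right
```

When the returned set contains more than one state, the reconstruction at the root is ambiguous and each of those states is equally parsimonious.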
Counting Tree-Child Networks and Their Subclasses
Galled trees are studied as a recombination model in population genetics.
This class of phylogenetic networks is generalized into tree-child, galled and
reticulation-visible network classes by relaxing a structural condition imposed
on galled trees. We count tree-child networks through enumerating their
component graphs. Explicit counting formulas are also given for galled trees
through their relationship to ordered trees, phylogenetic networks with few
reticulations and phylogenetic networks in which the child of each reticulation
is a leaf.
Comment: 24 pages, 2 tables and 9 figures
The compressions of reticulation-visible networks are tree-child
Rooted phylogenetic networks are rooted acyclic digraphs. They are used to
model complex evolution where hybridization, recombination and other
reticulation events play important roles. A rigorous definition of network
compression is introduced on the basis of recent studies of the
relationships between clusters, trees and rooted phylogenetic networks. The
concept reveals another interesting connection between the two well-studied
network classes (tree-child networks and reticulation-visible networks) and
enables us to define a new class of networks for which the cluster containment
problem has a linear-time algorithm.
Comment: 18 pages, 4 figures